The 4004 design team
tells its story.



Twenty-five years ago, in November
1971, an advertisement appeared in 
Electronic News: "Announcing a new 
era in integrated electronics, a micropro- 
grammable computer on a chip." The ad was 
placed by Intel Corporation of Santa Clara, 
California, then just over three years old. From 
that modest but prophetic beginning, the 
microprocessor market has grown into a 
multibillion-dollar business, and Intel has
maintained a leadership position, particularly 
in microprocessors for personal computers. 

In 1968, Bob Noyce and Gordon Moore, 
who had both just left Fairchild Semiconductor,
founded Intel Corporation, and operations
began in September of the same year.
The new company was committed to devel- 
oping semiconductor mainframe memory 
products using both bipolar and MOS (metal- 
oxide-semiconductor) technologies. Bipolar 
processes offered faster access times, while 
MOS processes promised higher chip com- 
plexity — that is, more memory bits per chip. 
Rather than use the established technologies 
of the day, Intel was determined to use new
bipolar and MOS processes similar to those 
Fairchild Semiconductor had just developed. 
For the MOS products, Intel chose a self- 
aligned P-channel silicon-gate process. 

Intel intended to produce proprietary 
memory products, rather than a specific 
product for each customer. Though this 
strategy offered high potential sales volume, 
it increased the design time. To optimize its 
revenue stream, therefore, Intel remained 
open to limited custom work, hoping that 
customers would be ready to use the prod- 
ucts as soon as they were working. The 
company did not project custom products 
to reach the high sales volumes it expected 
of the proprietary products, but it hoped 
they would provide an important source of 
revenue until the memory products were 
established. 



Busicom

In April of 1969, Intel agreed to develop a 
set of calculator chips for a Japanese firm. 
The firm consisted of two companies: 
Electro-Technical Industries handled prod- 
uct development, and Nippon Calculating 
Machines Company handled marketing. The 
calculators bore the brand name Busicom. 
Busicom intended to use the chip set in sev- 
eral different models of calculators, from a 
low-end desktop printing calculator to
calculator-like office machines such as billing
machines, teller machines, and cash registers.
The firm made arrangements for three
of its engineers to come to Intel to finish the 
logic design for the calculator chips and to 
work with Intel personnel to transfer the 
designs into silicon. The three engineers from
Japan (Masatoshi Shima and his colleagues
Masuda and Takayama) arrived in late June.

Intel assigned Marcian E. Hoff Jr. to act as
liaison to the Japanese engineers. Hoff had 
received his PhD from Stanford University 
in 1962 and had remained there as a 
research associate working on electronic 
neural networks until he joined Intel in 
September 1968. At Stanford, Hoff had
programmed and built hardware interfaces for
IBM model 1620 and 1130 computers. He 
was Intel's twelfth employee and received 
the title manager of applications research. 

Hoff's duties were to help define Intel
products, meeting with customers and
marketing personnel as necessary. In addition,
as Intel products became available, he 
would develop applications information to 
help customers use those products. 
However, because in early 1969 the products
were still in development, and there
were limited opportunities to question 
potential customers, Hoff had taken on sev- 
eral tasks peripheral to his primary duties. 

Although Hoff was only supposed to act 
as liaison to the Busicom engineering team, 
This figure shows the 4004 CPU and its main 4-bit multiplexed,
tristate bus driving up to 16 ROM (4001) and up
to 16 RAM (4002) chips. The 4004 sends six other control
signals to all the 4001 and 4002 chips. Each 4002 contains
four registers of 20 nibbles each (16 + 4 nibbles of 4 bits). It also
contains a 4-bit output port. Each 4001 contains 256 bytes
of ROM and a 4-bit I/O port; ROM and ports are metal-mask
programmable. The 4003 is a 10-bit serial-in, parallel-out
and serial-out shift register used for keyboard
scanning, printer control, and so on. The output ports of
the 4001/4002 drive the 4003.

All the chips are packaged in 16-lead DIPs. The 4000 chip
set can have up to 4 Kbytes of ROM (sixteen 4001 chips),
1,280 nibbles of RAM (sixteen 4002 chips), 32 directly
addressable 4-bit I/O ports, and an unlimited number of output
ports via the 4003. The addition of a few external gates
doubles the amount of addressable ROM or RAM. (The RAM
chip can store only data, not instructions.) The minimum
system configuration consists of one 4004 (CPU) and one
4001 (ROM with 4-bit I/O port).
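The capacity figures in this sidebar follow directly from the per-chip numbers. The sketch below simply recomputes them; the constant and function names are illustrative and not taken from any Intel documentation.

```python
# Per-chip figures quoted in the sidebar above.
ROM_BYTES_PER_4001 = 256
REGISTERS_PER_4002 = 4
NIBBLES_PER_REGISTER = 16 + 4   # 16 main-memory nibbles plus 4 status nibbles
MAX_ROM_CHIPS = 16
MAX_RAM_CHIPS = 16


def max_rom_bytes() -> int:
    """Total ROM when all sixteen 4001 chips are fitted."""
    return MAX_ROM_CHIPS * ROM_BYTES_PER_4001


def max_ram_nibbles() -> int:
    """Total RAM nibbles when all sixteen 4002 chips are fitted."""
    return MAX_RAM_CHIPS * REGISTERS_PER_4002 * NIBBLES_PER_REGISTER


def max_io_ports() -> int:
    """One 4-bit port per 4001 plus one 4-bit output port per 4002."""
    return MAX_ROM_CHIPS + MAX_RAM_CHIPS


if __name__ == "__main__":
    print(max_rom_bytes(), max_ram_nibbles(), max_io_ports())
```

The three results correspond to the 4 Kbytes of ROM, 1,280 nibbles of RAM, and 32 ports quoted in the text.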



his curiosity about the calculator led him to study the design.
His first reaction was surprise at how complex the calculator
logic was, particularly when compared to the general-purpose
digital computers he had used. In addition, the
interconnections between the various chips were extensive, 
requiring large and expensive packages. Having attended 
several meetings on the project's cost objectives, Hoff 
became concerned that the packaging requirements alone 
might prevent Intel from meeting those objectives. 

Busicom had proposed a ROM-based, macroinstruction 
programmable decimal computer consisting of seven differ- 
ent LSI chips: program control, decimal arithmetic unit, tim- 
ing logic, ROM, shift register, printer control, and output 
ports. Busicom had already successfully implemented this 



design in commercial products since 1968, using transistor- 
transistor logic (TTL) and ROM. 

Hoff expressed some of his concerns about packaging and 
design complexity to Intel's upper management — designing 
that many chips could be a daunting task for the limited chip 
design staff. Bob Noyce particularly encouraged him to pur- 
sue an alternative design if one appeared feasible. 

Hoff was initially reluctant to deviate too far from the orig- 
inal Busicom design. While some aspects of the proposed 
chip set were similar to those of other calculators of the day, 
it also included some novel capabilities. Most notable were 
the use of ROM for macroinstruction storage and a special- 
ized instruction set that would allow various calculator- 
intensive machines to use the same chips. Another innovative 
Timing diagram of the 4000 series

A single -15-V power supply powers the P-channel MOS
chip set. (Alternatively, it could use -12 V and +5 V to allow
TTL compatibility, a very useful feature in the design of
microcomputer development tools.) It uses two interleaved
clocks, φ1 and φ2, as shown. The clock frequency is 750
kHz, and the instruction cycle requires 8 or 16 clock cycles,
depending on the instruction.

The 4004 generates a synchronizing signal, SYNC, which
marks the beginning of the instruction cycle and which the
4004 sends to all the 4001 and 4002 chips. During the first
three clock cycles, the 4004 sends a 12-bit address to the
ROM chips; the selected 4001 outputs its 8-bit instruction
opcode in the next two clock periods. The 4004 interprets
and executes the instruction during the next three clock
periods. A few instructions (Jump, for example)
require two bytes and execute in 16 clock periods. The typical
instruction execution time is 10.8 µs, or 21.6 µs for the
double-length instructions.
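The cycle counts and execution times above can be cross-checked with a few lines of arithmetic. This is only a sketch with illustrative names. Note that the rounded 750-kHz figure gives about 10.7 µs for 8 cycles, slightly under the quoted 10.8 µs; the quoted times imply a clock nearer 740 kHz.

```python
CLOCK_HZ = 750_000     # clock frequency quoted in the sidebar
ADDRESS_CYCLES = 3     # 12-bit address sent 4 bits per clock cycle
FETCH_CYCLES = 2       # 8-bit opcode returned 4 bits per clock cycle
EXECUTE_CYCLES = 3     # interpret and execute


def cycles_per_instruction(words: int = 1) -> int:
    """Clock cycles for a one-word (8) or two-word (16) instruction."""
    return words * (ADDRESS_CYCLES + FETCH_CYCLES + EXECUTE_CYCLES)


def instruction_time_us(words: int = 1, clock_hz: float = CLOCK_HZ) -> float:
    """Execution time in microseconds at the given clock frequency."""
    return cycles_per_instruction(words) * 1e6 / clock_hz


def implied_clock_hz(cycles: int, time_us: float) -> float:
    """Clock frequency implied by a cycle count and a quoted execution time."""
    return cycles * 1e6 / time_us


if __name__ == "__main__":
    print(round(instruction_time_us(1), 2), round(instruction_time_us(2), 2))
    print(round(implied_clock_hz(8, 10.8)))
```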



feature was the variable amount of shift register memory that 
the design could use, with the different calculator models
having different numbers of memory registers. 

Like many calculators of the day, Busicom's design used 
shift registers for memory. Shift registers are quite fast for 
the arithmetic calculations, display, and printing that calcu- 
lators require, but are slow for operations requiring random 
access. Shift registers used six transistors per bit, like static 
RAM, but a shift register cell was smaller than the RAM cell. 
The shift register's size advantage, however, was offset by
increased control logic complexity and slower speed for random
access. Any access to even a portion of a memory register
required a complete scan through that register. Such
slow memory access would make a conventional CPU
architecture too slow to be practical.

The 4004 is conceived 

Intel had just begun working on dynamic MOS RAM, using 
a three-transistor cell. Hoff, aware of that development, 
thought that if he could solve its refresh problem in the cal- 
culator environment, the DRAM would be an ideal alterna- 
tive to the shift register memory. Unlike the shift register, the 
DRAM could be accessed in as small a quantum as desired. 
In addition, the three-transistor DRAM cell used even less 



silicon area than the shift register cell. 

One of the first modifications to the Busicom design Hoff 
considered was adding subroutine capability to the instruction
set. Subroutines of simple instructions could replace
more complex instructions, which should allow simplifica- 
tion of the hardwired logic. Although the Busicom engineers 
appeared unreceptive to Hoff's proposals, with Noyce's 
encouragement, Hoff continued exploring options. 

Hoff began to consider the design of a general-purpose 
computer that might be programmed to perform calculator 
functions. The computer would fetch program instructions
from a ROM into an arithmetic chip. The arithmetic chip 
would interpret the instructions, reading and writing as nec- 
essary to DRAM chips. The arithmetic chip would also have 
local "scratch-pad" registers. During the time the arithmetic 
chip was fetching program instructions from ROM, the 
DRAMs could be refreshed, since no instruction execution 
would be occurring at that time. The data quantum would be 
4 bits to allow binary-coded decimal (BCD) arithmetic. 
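To illustrate why a 4-bit data quantum suits BCD arithmetic, here is a minimal sketch of digit-serial decimal addition: each decimal digit occupies one nibble, digits are added in binary, and a decimal-adjust step corrects any result above 9. The function names and the least-significant-digit-first list layout are assumptions made for this example, not details from the article.

```python
def add_bcd_digit(a: int, b: int, carry_in: int = 0) -> tuple[int, int]:
    """Add two BCD digits (0-9) plus a carry; return (digit, carry_out)."""
    total = a + b + carry_in          # plain binary addition of two nibbles
    if total > 9:
        # Decimal adjust: add 6 to skip the six unused 4-bit codes (10-15).
        return (total + 6) & 0xF, 1
    return total, 0


def add_bcd(x: list[int], y: list[int]) -> tuple[list[int], int]:
    """Add two equal-length digit lists, least significant digit first."""
    carry = 0
    result = []
    for a, b in zip(x, y):
        digit, carry = add_bcd_digit(a, b, carry)
        result.append(digit)
    return result, carry


if __name__ == "__main__":
    # 25 + 37, stored least significant digit first.
    print(add_bcd([5, 2], [7, 3]))
```

A processor working one nibble at a time performs exactly one pass of this loop per decimal digit, which is why per-digit addition time is the natural speed measure for a calculator CPU.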

Hoff performed these studies of a general architecture in 
July and August of 1969. During this time, he believed the 
Busicom team was unresponsive to his idea. On the con- 
trary, Busicom engineers recognized that Hoff's proposal of 
a general-purpose CPU was more advantageous than their 
Block diagram of the 4004 CPU

The 4004 CPU contains 16 general-purpose 4-bit registers;
one 4-bit accumulator; a four-level, 12-bit push-down
address stack containing the program counter and three
return addresses for subroutine nesting; a binary and decimal
arithmetic unit; instruction register, decoder, and control
logic; timing logic; bus control; and miscellaneous
control.

In addition to the pins required for the 4-bit tristate data
bus, two-phase clock, power, and ground, the 4004 has a
SYNC timing output pin and five control lines for addressing
the 4001 and 4002 chips (CM ROM and CM RAM0-3;
"CM" indicates command). There is also a reset pin to initialize
the system and a test input pin. Test provides one
of the conditions in the conditional jump instruction (JCN),
allowing the 4004 to poll external devices. Later generation
processors replaced Test with a much more convenient
interrupt facility.

Although several prior computer architectures inspired
the 4004 (PDP-8, IBM 1620, and so on), it is unique in
many aspects. Its main value resides in its simplicity and
economy of means, essential ingredients given the limited
capabilities of 1970 semiconductor technology.



design. The concept was still incomplete, however, and 
required additional features to function satisfactorily in 
Busicom's products. Certain calculator functions, such as 
decimal adjust and keyboard processing, required too many 
bytes of ROM, and there was no mechanism for real-time
control to synchronize the CPU with external events. Also,
the RAM chip's organization did not seem well suited for
storing the decimal position, sign, and other data necessary
for calculating a decimal string of digits.



In September 1969, Stanley Mazor joined Intel from 
Fairchild, where he had been since 1964, and where he had 
helped design the Symbol computer. After Mazor arrived, 
progress began to accelerate. 

Working together, Mazor and Hoff further refined the idea 
of a general-purpose design and demonstrated its potential 
capabilities, addressing the objections raised by the Busicom 
team. In response to the Busicom engineers' need for 
macroinstruction capability, Mazor proposed adding a Fetch 
4004 instruction set

The 4004 has 16 instruction types.
The main group contains 14 general
instructions: register instructions,
conditional and unconditional jumps,
increment and skip if zero, and so on.

The I/O and RAM group includes 15
types. RAM data addressing in the 4000
architecture works as follows. First, the
DCL instruction selects a group of four
RAM chips out of four groups. Then,
the SRC instruction selects one 4-bit
nibble out of the 256 regular nibbles
contained in the selected group of four
chips. Finally, one of the I/O and RAM
instruction types performs the operation
on the selected data. Each RAM
chip contains 64 regular nibbles and 16
status nibbles (hence 80 nibbles); the
I/O and RAM instructions address the
status nibbles directly.
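The two-step addressing described above can be sketched as a small decoder. The text states only that SRC selects one of 256 regular nibbles within a DCL-selected group of four chips; the specific bit layout used here (top two bits for the chip, next two for the register, low four for the character) is an assumption for illustration, as are the function names.

```python
def decode_src(address: int) -> tuple[int, int, int]:
    """Split an 8-bit SRC address into (chip, register, character).

    Assumed layout: bits 7-6 chip (0-3), bits 5-4 register (0-3),
    bits 3-0 main-memory character (0-15).
    """
    if not 0 <= address <= 0xFF:
        raise ValueError("SRC address must fit in 8 bits")
    chip = (address >> 6) & 0x3
    register = (address >> 4) & 0x3
    character = address & 0xF
    return chip, register, character


def flat_nibble_index(group: int, address: int) -> int:
    """Index of a regular nibble across all four DCL groups (0-1023)."""
    if not 0 <= group <= 3:
        raise ValueError("DCL selects one of four groups")
    chip, register, character = decode_src(address)
    return ((group * 4 + chip) * 4 + register) * 16 + character


if __name__ == "__main__":
    print(decode_src(0xB7), flat_nibble_index(1, 0x00))
```

With four groups of 256 regular nibbles each, the flat index covers the 1,024 regular nibbles of a fully populated system (sixteen chips of 64 regular nibbles); the status nibbles are reached separately by the instruction code.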

Finally, the accumulator group
includes 11 binary arithmetic instruction
types, a decimal-adjust instruction,
a code conversion instruction (KBP,
keyboard process) that reduces the
number of ROM bytes required for the
keyboard-scanning operation, and a
memory control instruction (DCL). Most
of the instructions execute in 8 clock
cycles (10.8 µs).

Notes:

(1) The condition code is assigned as follows:
    C1 = 1  Invert jump condition
    C1 = 0  Not invert jump condition
    C2 = 1  Jump if accumulator is zero
    C3 = 1  Jump if carry/link is a 1
    C4 = 1  Jump if test signal is a 0

(2) RRR is the address of one of eight index
register pairs in the CPU.

(3) RRRR is the address of one of 16 index
registers in the CPU.

(4) Each RAM chip has four registers, each with
twenty 4-bit characters subdivided into 16 main
memory characters and four status characters.
Chip number, RAM register, and main memory
character are addressed by an SRC instruction. For
the selected chip and register, however, status
character locations are selected by the instruction
code (OPA).



MCS-4 Instruction Set

[Those instructions preceded by an asterisk (*) are 2-word instructions that occupy 2 successive locations in ROM]

MACHINE INSTRUCTIONS

NOP   0000 0000
      No operation.

*JCN  0001 C1C2C3C4 / A2A2A2A2 A1A1A1A1
      Jump to ROM address A2 A2 A2 A2, A1 A1 A1 A1 (within the same
      ROM that contains this JCN instruction) if condition C1 C2 C3 C4 (1)
      is true, otherwise skip (go to the next instruction in sequence).

*FIM  0010 RRR0 / D2D2D2D2 D1D1D1D1
      Fetch immediate (direct) from ROM Data D2, D1 to index register pair
      location RRR. (2)

SRC   0010 RRR1
      Send register control. Send the address (contents of index register pair RRR)
      to ROM and RAM at X2 and X3 time in the Instruction Cycle.

FIN   0011 RRR0
      Fetch indirect from ROM. Send contents of index register pair location 0
      out as an address. Data fetched is placed into register pair location RRR at
      A1 and A2 time in the Instruction Cycle.

JIN   0011 RRR1
      Jump indirect. Send contents of register pair RRR out as an address
      at A1 and A2 time in the Instruction Cycle.

*JUN  0100 A3A3A3A3 / A2A2A2A2 A1A1A1A1
      Jump unconditional to ROM address A3, A2, A1.

*JMS  0101 A3A3A3A3 / A2A2A2A2 A1A1A1A1
      Jump to subroutine ROM address A3, A2, A1, save old address. (Up 1 level
      in stack.)

INC   0110 RRRR
      Increment contents of register RRRR. (3)

*ISZ  0111 RRRR / A2A2A2A2 A1A1A1A1
      Increment contents of register RRRR. Go to ROM address A2, A1
      (within the same ROM that contains this ISZ instruction) if result ≠ 0,
      otherwise skip (go to the next instruction in sequence).

ADD   1000 RRRR
      Add contents of register RRRR to accumulator with carry.

SUB   1001 RRRR
      Subtract contents of register RRRR to accumulator with borrow.

LD    1010 RRRR
      Load contents of register RRRR to accumulator.

XCH   1011 RRRR
      Exchange contents of index register RRRR and accumulator.

BBL   1100 DDDD
      Branch back (down 1 level in stack) and load data DDDD to accumulator.

LDM   1101 DDDD
      Load data DDDD to accumulator.

INPUT/OUTPUT AND RAM INSTRUCTIONS

(The RAMs and ROMs operated on in the I/O and RAM instructions have been previously selected by the last SRC instruction executed.)

WRM   1110 0000
      Write the contents of the accumulator into the previously selected
      RAM main memory character.

WMP   1110 0001
      Write the contents of the accumulator into the previously selected
      RAM output port. (Output Lines)

WRR   1110 0010
      Write the contents of the accumulator into the previously selected
      ROM output port. (I/O Lines)

WR0   1110 0100
      Write the contents of the accumulator into the previously selected
      RAM status character 0.

WR1   1110 0101
      Write the contents of the accumulator into the previously selected
      RAM status character 1.

WR2   1110 0110
      Write the contents of the accumulator into the previously selected
      RAM status character 2.

WR3   1110 0111
      Write the contents of the accumulator into the previously selected
      RAM status character 3.

SBM   1110 1000
      Subtract the previously selected RAM main memory character from
      accumulator with borrow.

RDM   1110 1001
      Read the previously selected RAM main memory character
      into the accumulator.

RDR   1110 1010
      Read the contents of the previously selected ROM input port
      into the accumulator. (I/O Lines)

ADM   1110 1011
      Add the previously selected RAM main memory character to
      accumulator with carry.

RD0   1110 1100
      Read the previously selected RAM status character 0 into accumulator.

RD1   1110 1101
      Read the previously selected RAM status character 1 into accumulator.

RD2   1110 1110
      Read the previously selected RAM status character 2 into accumulator.

RD3   1110 1111
      Read the previously selected RAM status character 3 into accumulator.

ACCUMULATOR GROUP INSTRUCTIONS

CLB   1111 0000
      Clear both. (Accumulator and carry)

CLC   1111 0001
      Clear carry.

IAC   1111 0010
      Increment accumulator.

CMC   1111 0011
      Complement carry.

CMA   1111 0100
      Complement accumulator.

RAL   1111 0101
      Rotate left. (Accumulator and carry)

RAR   1111 0110
      Rotate right. (Accumulator and carry)

TCC   1111 0111
      Transmit carry to accumulator and clear carry.

DAC   1111 1000
      Decrement accumulator.

TCS   1111 1001
      Transfer carry subtract and clear carry.

STC   1111 1010
      Set carry.

DAA   1111 1011
      Decimal adjust accumulator.

KBP   1111 1100
      Keyboard process. Converts the contents of the accumulator from a
      one out of four code to a binary code.

DCL   1111 1101
      Designate command line.
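As an illustration of the keyboard-process (KBP) code conversion listed above, the sketch below maps a one-out-of-four code (a single set bit among four keyboard lines) to a binary position number. The handling of the all-zero input (returned as 0) and of inputs with more than one bit set (returned as 15, an error marker) is stated here as an assumption about the instruction's behavior; the table and function names are illustrative.

```python
# One-out-of-four code -> binary position. Keys are 4-bit accumulator values.
KBP_TABLE = {
    0b0000: 0,  # no line active (assumed to pass through as 0)
    0b0001: 1,
    0b0010: 2,
    0b0100: 3,
    0b1000: 4,
}

KBP_ERROR = 0b1111  # assumed marker when more than one line is active


def kbp(accumulator: int) -> int:
    """Convert a one-out-of-four code in the accumulator to binary."""
    return KBP_TABLE.get(accumulator & 0xF, KBP_ERROR)


if __name__ == "__main__":
    for code in (0b0001, 0b0100, 0b0110):
        print(f"{code:04b} -> {kbp(code)}")
```

Doing this conversion in a single instruction replaces a short chain of shifts and tests, which is how it saves ROM bytes in the keyboard-scanning routine.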



Indirect instruction and coded a 20-byte interpreter to execute
1-byte macroinstructions. Shima, the Busicom engineer
in charge of programming, further refined Mazor's interpreter.
In addition, Shima proposed including a conditional jump
based on the status of an external input pin (test), adding an
instruction that would simplify keyboard scanning, and
modifying the Branch Back instruction.

It appeared at the time that a 1-MHz clock would be feasible
for the processor's logic. To allow the use of small, inexpensive
packages (16 leads), the design could include
extensive multiplexing of interconnecting
lines. Four data transfer lines
would permit the processor to trans- 
fer one 4-bit quantum each clock 
cycle. With a 12-bit program address 
and an 8-bit instruction, it would take 
five clock cycles to address and fetch
an instruction. Since most instructions
would be simple, three cycles for exe- 
cution seemed adequate. Those tim- 
ing parameters and a 1-MHz clock 
would allow the processor to add 
multidigit BCD numbers at a rate of 
80 microseconds per digit. This speed 
was comparable to that of the IBM 
1620 computer Hoff had used in the 
early 1960s. 
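The timing estimate in this paragraph can be reproduced with simple arithmetic. The sketch below uses illustrative names; the figure of roughly ten instructions per BCD digit is an inference from the quoted 80 microseconds, not a number stated in the text.

```python
import math

BUS_WIDTH_BITS = 4   # four multiplexed data transfer lines


def transfer_cycles(bits: int) -> int:
    """Clock cycles needed to move a value over the 4-bit bus."""
    return math.ceil(bits / BUS_WIDTH_BITS)


def instruction_cycles(address_bits: int = 12,
                       instruction_bits: int = 8,
                       execute_cycles: int = 3) -> int:
    """Address + fetch + execute cycles for one simple instruction."""
    return (transfer_cycles(address_bits)
            + transfer_cycles(instruction_bits)
            + execute_cycles)


def instruction_time_us(clock_hz: float = 1_000_000) -> float:
    """Time per simple instruction at the given clock frequency."""
    return instruction_cycles() * 1e6 / clock_hz


def instructions_per_digit(digit_time_us: float = 80.0) -> float:
    """How many simple instructions fit in the quoted per-digit add time."""
    return digit_time_us / instruction_time_us()


if __name__ == "__main__":
    print(instruction_cycles(), instruction_time_us(), instructions_per_digit())
```

Three cycles for the address plus two for the instruction give the five fetch cycles mentioned above; adding three execution cycles yields 8 microseconds per instruction at 1 MHz.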

By mid September, Intel marketing 
was sufficiently confident of the new 
approach to suggest it to Busicom 
management as an alternate to the 
original design. In October 1969, 
Intel held a formal meeting with the 
Japanese firm's management, who 
had come to the US to discuss the 
project. Intel presented both 
approaches, with Hoff and Mazor
arguing that the Intel architecture was
much more flexible than the original. 
Busicom's managers, appreciating 
the architecture's increased simplici- 
ty and flexibility, chose the Intel 
design, and Intel became committed 
to build the first single-chip comput- 
er CPU. The Busicom engineers 
returned to Japan, except Shima. He 
stayed on at Intel until December to 
develop many of the calculator's key 
software programs, which he based 
on the new architecture and its 
instruction set. 

When the companies signed a 
development contract, Hoff was dis- 
appointed to learn that, although Intel had developed the 
architecture and it differed markedly from the original, the 
contract gave exclusive rights to Busicom. Intel marketing 
explained that the project would not have proceeded with- 
out that concession. 

Intel was now committed to develop the chips for the new 
architecture, but the company had a staffing problem. Neither 
Hoff nor Mazor had designed chips, and the proposed chips' 
complexity would require someone with extensive experi- 
ence. Thus, the design would fall to a different department 
than Applications Research. Since MOS designers were in 
short supply, and all of those at Intel were already commit- 
ted to memory projects, Intel would have to recruit someone 
to take over the project's logic and circuit design and the sil- 
icon implementation phase. That process would take months. 

In the meantime, Hoff and Mazor had responsibility for gen- 



The 4004 chip 

The 4004 is the first example of a complex random-logic circuit built using
silicon-gate MOS technology. Silicon gate was essential in obtaining the small size
and the high speed (for the day) required by a general-purpose CPU. The chip
measures 3.0 mm × 4.0 mm and integrates approximately 2,300 transistors.

Under Federico Faggin's direction, three layout draftsmen drew the composite
layout of the 4004 using colored lead pencils on mylar at 500 times the actual scale.
The composite layout translated the abstract circuit diagram into the actual geometry
of the transistors and their interconnections. Showing all the masking layers
required for processing, the layout served as a template for the preparation of the
"rubies." A rubylith consists of a mylar sheet with a thin layer of semitransparent,
red material that can be cut and peeled off. The composite layout, placed underneath
the ruby, guided the cutting and peeling operations. One ruby was prepared
for each mask layer required in the wafer processing. The 4004 required six layers,
including the scratch-protection layer; the other chips in the set required five.
The ruby was then photoreduced to 10 times the 4004's actual scale to prepare the
reticle. The reticle, in turn, was used to create the actual scale mask via a step-and-repeat
optical process.
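To give a sense of the physical sizes involved, the scale factors above can be applied to the quoted chip dimensions. This is a rough sketch with illustrative names; it assumes the drawing and reticle covered exactly the 3.0 mm × 4.0 mm die area, ignoring any margins.

```python
CHIP_MM = (3.0, 4.0)    # die dimensions quoted in the sidebar
LAYOUT_SCALE = 500      # composite layout and rubies
RETICLE_SCALE = 10      # reticle after photoreduction


def scaled(size_mm: tuple[float, float], factor: float) -> tuple[float, float]:
    """Scale a (width, height) pair given in millimeters."""
    return (size_mm[0] * factor, size_mm[1] * factor)


def photoreduction_ratio() -> float:
    """Reduction applied when going from ruby to reticle."""
    return LAYOUT_SCALE / RETICLE_SCALE


if __name__ == "__main__":
    print(scaled(CHIP_MM, LAYOUT_SCALE))   # layout size in mm
    print(scaled(CHIP_MM, RETICLE_SCALE))  # reticle image size in mm
    print(photoreduction_ratio())
```

Under these assumptions the hand-drawn layout measured about 1.5 m × 2.0 m, and the reticle image about 30 mm × 40 mm.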




erating applications information for the memory products that
Intel was adding to its product line. One of the more success- 
ful memory products was a line of shift registers that quickly 
found a market in CRT (cathode-ray tube) computer terminals. 

One of the customers for shift registers was Computer 
Terminals Corporation of San Antonio, Texas. In December 
1969, an officer of that company inquired if Intel could modify
an existing Intel static RAM (the 3101, a bipolar 64-bit
RAM) to create a 4 × 16 stack memory for an intelligent terminal
CTC was designing, the Datapoint 2200.

Mazor and Hoff studied the request and determined that the 
CTC processor did not appear much more complicated than the 
proposed 4004. They concluded that it would be feasible to 
make a single-chip, 8-bit microprocessor. They drew up a tar- 
get specification, and CTC contracted with Intel for the devel- 
opment of what would be Intel's second microprocessor. By 
The Busicom calculator

Shown here is the engineering prototype of the Busicom
calculator that opened the microprocessor application floodgates.
This 14-digit, floating- and fixed-point, printing calculator,
completed in April 1971, had memory and an
optional square-root function. It was designed, built, and
marketed by Busicom in Japan. Intel's 4000 series performed
all the electronic functions of this calculator, except the
discrete-transistor printer-driver circuitry, the clock generator,
and miscellaneous lamp drivers. In all, the calculator used
one 4004 CPU, four 4001 ROM chips, two 4002 RAM chips,
and three 4003 shift register chips. (The model with the
square-root function used one additional 4001.) Busicom
sold this calculator worldwide beginning in July 1971.




early 1970, Intel was committed to produce two different sin- 
gle-chip computers, and still had no staff to do the design. 

The design of the 4004 

Early in 1970, Leslie Vadasz, who headed Intel's MOS design 
group, announced he had found someone to do the design of 
the calculator chip set: Federico Faggin. Faggin worked at
Fairchild, where he and Tom Klein had developed the original
MOS silicon-gate process in 1968. He had also designed
the first commercial circuit to use that technology (the 3708, an 
8-bit analog multiplexer with decoding logic). Faggin also had 
experience with computer design, having codesigned and built 
a small computer for Olivetti in his homeland, Italy, in 1961. 

Faggin joined Intel in April 1970, as the engineer in charge 
of the design of the calculator set. Internally called the 4000 
family, the set consisted of four chips: the ROM program mem- 
ory (4001), the RAM register memory (4002), an I/O expan- 
sion shift register chip (4003), and the CPU (4004). A couple 
of days after Faggin joined Intel, Shima arrived from Japan to 
check on the project's progress. Shima was very disappointed 
that no progress had been made since he left Intel in December 
1969; according to him, the schedule for the project had been 
irreparably compromised. Because of this delay, Faggin began 
work at a furious pace, often far into the early morning hours, 
to make up as much of the lost time as possible. Shima stayed 
at Intel for six months to help Faggin with the project. 

After resolving the few remaining architectural issues, 
Faggin laid down the foundations of the design methodolo- 
gy he was going to use for the chips. Random logic design 
with silicon-gate technology required a different methodol- 
ogy than metal-gate technology, and no one had ever 
designed a circuit of the 4004's complexity. 

An important element of Faggin's methodology was its use 
of bootstrap loads. These circuits provided faster output volt- 
age swings, switching to the full supply voltage instead of the 
supply voltage minus the transistor threshold voltage (aug- 
mented by the body effect). Bootstrap loads allowed him to 
use pass transistors, simplifying the circuit design and reduc- 
ing the number of transistors necessary to perform the required 
logic functions. In those days, it was common belief that boot- 
strap loads were not feasible with silicon-gate technology, 
unless the design incorporated an additional masking step. 



Faggin, however, had figured out how to make bootstrap loads 
without modifying the process architecture. This circuit trick
was essential to achieve the necessary speed and density with- 
out exceeding the power budget. 

Faggin was also happy to find that Intel had adopted the 
"buried-contact" design. This technique, similar to the one he 
had developed at Fairchild, permitted direct connections 
between the polysilicon layer and the diffusion layer, allowing
higher circuit densities. The buried contact was essential to
achieve a manufacturable chip size for the 4004. 

Faggin decided to design the 4001 first, followed by the 
4003, the 4002, and finally the 4004. In those days, there was 
little automation of the design process. Although Intel had 
access to a time-shared mainframe computer for critical cir- 
cuit simulation (via a 10-characters-per-second teletype), the 
company discouraged Faggin from using it because of its 
cost. So, Faggin did most of the circuit design with a slide rule
and graphical analysis based on measured static and
dynamic transistor characteristics.

Designing a production integrated circuit took many steps, 
starting with the definition of the chip architecture and its 
basic specifications. For the 4000 set, Hoff and Mazor com- 
pleted these initial steps, with contributions by Shima and the 
other Busicom engineers. Next came the logic design, circuit 
design, layout design, ruby-cutting, mask making, wafer pro- 
cessing, chip verification and debugging, characterization, 
production test-pattern development, and transfer to manufacturing.
The entire process, starting from the logic design
and ending with working samples, would take a minimum of 
six months for a simple chip, longer for a complex one. 

At the peak of the project, Faggin and Shima worked simul- 
taneously on all four chips at different stages of the develop- 
ment process. The 4004's detailed logic design, which Shima 
undertook, took place during June and July. Shima also did the 
logic simulation, while Faggin concentrated on the circuit 
design, layout, and overall supervision of the project. 

The 4004 comes to life 

Intel processed the first silicon wafers of the 4001 in 
October 1970, and Faggin found the circuit fully functional. 
In preparation for receiving the chips, Shima returned to 
Japan to complete writing the software and to build the engineering
prototype of the Busicom calculator. In November,
4003 and 4002 wafers came out of the processing line. The
4003 was fully functional, and the 4002 had a minor problem
that was soon diagnosed and corrected.

Finally, at the end of December the big day arrived; Faggin 
received the first 4004 wafers, less than nine months after he 
had begun the project. Faggin's hands were trembling as he 
loaded the first wafer in the wafer prober to begin the test, 
and as he probed around the 4004, he found absolutely no
life. He couldn't believe his eyes. Within half an hour,
however (the longest half hour of his life), Faggin found that
one masking step (the buried-contact layer) had been left
out during wafer processing. This manufacturing problem
explained why the 4004 was dead. 

It wasn't until January 1971 that Intel processed a new run 
of 4004 wafers. Faggin received the wafers in the evening 
and, alone in the lab, tested them through most of the night. 
This time, everything worked as expected. That was the night 
the 4004 was born. 

During the following days, Faggin continued verifying the
4004 and found two minor bugs that were soon diagnosed
and corrected. As a result, he achieved fully functional 4004s 
on the next mask iteration, in March 1971. 

After thoroughly testing the 4004, Faggin sent several samples 
to Busicom, where Shima was testing the calculator and debug- 
ging his software using a RAM-based emulator for the 4001. 

The Busicom calculator used one 4004, two 4002, three 4003, 
and four 4001 chips; it used an additional 4001 for the option- 
al square-root function. In other words, the system consisted 
of a 4-bit CPU running approximately 100,000 instructions per 
second, with 1 Kbyte of ROM, 80 bytes of RAM, and approxi- 
mately 50 I/O lines. Today, using 0.35-micron lithography, the 
most advanced manufacturing technology in production, these 
functions, without the bonding pads, would occupy less than 
one tenth of a square millimeter. (Incidentally, the manufac- 
turing cost would now be approximately half a cent.) 
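
The totals quoted above can be cross-checked with a little arithmetic. The sketch below is illustrative only. The per-chip capacities (256 bytes of ROM and a 4-line I/O port per 4001, 320 bits of RAM and a 4-line output port per 4002, 10 output lines per 4003) are assumptions taken from commonly published MCS-4 figures, not from this article, and all names in the code are hypothetical.

```python
# Cross-check of the Busicom calculator totals quoted in the text.
# Per-chip figures are ASSUMPTIONS (commonly published MCS-4 values),
# not numbers stated in this article.
ROM_BYTES_PER_4001 = 256      # 256 x 8-bit mask ROM
IO_LINES_PER_4001 = 4         # one 4-bit I/O port
RAM_BITS_PER_4002 = 320       # 80 four-bit characters
OUT_LINES_PER_4002 = 4        # one 4-bit output port
OUT_LINES_PER_4003 = 10       # 10-bit shift register

# Chip counts from the article (base system, without the optional
# square-root ROM).
CHIPS = {"4001": 4, "4002": 2, "4003": 3, "4004": 1}


def system_totals(chips):
    """Return (rom_bytes, ram_bytes, io_lines) for a chip-count mapping."""
    rom_bytes = chips["4001"] * ROM_BYTES_PER_4001
    ram_bytes = chips["4002"] * RAM_BITS_PER_4002 // 8
    io_lines = (chips["4001"] * IO_LINES_PER_4001
                + chips["4002"] * OUT_LINES_PER_4002
                + chips["4003"] * OUT_LINES_PER_4003)
    return rom_bytes, ram_bytes, io_lines


rom, ram, io = system_totals(CHIPS)
print(f"ROM: {rom} bytes, RAM: {ram} bytes, I/O lines: {io}")
```

Under these assumptions the sketch gives 1,024 bytes of ROM, 80 bytes of RAM, and 54 I/O lines, which is consistent with the "1 Kbyte," "80 bytes," and "approximately 50" quoted in the text.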

In April, word came that the Busicom calculator was fully
functional. That was the final and essential proof that all the
chips were working properly, individually and as a system.
That same month, Shima sent Intel the final ROM patterns to 
generate the custom metal masks for the calculator's four 
4001s. This was the last step preceding volume production, 
which was to start in June. 

Finishing touches 

During the 4004 characterization, which began in March, 
Faggin observed a very disturbing phenomenon: at high
temperature, some 4004s were occasionally failing, but when he
tested them again, they would pass. This problem was
maddening, because the lack of repeatability and the lack of
diagnostic tools made it very difficult to find the reason for such
elusive failures. It took a few days to conclusively determine 
that the problem was caused by the corruption of some of 
the data stored in the DRAM registers. However, Faggin was 
at a loss to understand the mechanism responsible for it. 

After a tense week of tests and analysis, however, he traced 
the problem to a weakness in the RAM decoder's design,
which caused minority carriers to be injected into the
substrate, leaking away the electrical charge stored in the DRAM
cell. (Intel had avoided a similar problem in its standard
DRAM components by using an additional substrate bias, not
desirable in the 4000 series.) Once Faggin understood the
problem, he soon found a solution. Fortunately, there was 
enough room in the chip to make the necessary modifica- 
tions to the decoder without a major redesign. 

Faggin was surprised that no similar problem had ever been 
observed in the 4002, which had the same decoder design as 
the 4004. To make sure, Faggin created a special test sequence 
to see if the 4002 would also fail under properly adverse con- 
ditions. Indeed, the 4002 demonstrated such failures, vali- 
dating the problematic-decoder hypothesis and leading 
Faggin to change its design in the 4002 as well. These were 
the last steps to ensure that the company would manufacture 
a quality product, averting potential problems in the field. 
Production could then start in earnest, and by August 1971, 
the 4000 series became a major source of revenue for Intel. 

Marketing the 4004 

When Faggin found that the 4000 chip set was exclusive
to Busicom, he was very disappointed because he saw the
set's market potential reaching far beyond calculators.
Though he started lobbying management to obtain the rights 
to sell the 4000 series to the general market, the sentiment 
at Intel was that the 4004 was good mostly for calculators 
and calculator-like products. In an effort to prove otherwise, 
Faggin decided to use the 4004 as the controller for the 4004 
production tester he was designing. Conveniently, he was
able to load the software into the new EPROM devices
(electrically programmable read-only memory, just invented at
Intel by Dov Frohman-Bentchkovsky), instead of the mask- 
programmable 4001s. After successfully completing this pro- 
ject, Faggin used the example to prove to management that 
the 4004 was quite useful and thus marketable for applica- 
tions besides calculators. 

Hoff later found that the 4004 simplified the design of a unit 
for programming the EPROM devices while providing the abil- 
ity for rapid upgrades. Because EPROM promised to be ideal 
for holding programs for the single-chip computers, Hoff and 
his group developed a circuit board containing interface circuits 
that would allow the EPROM to substitute for the 4001 ROM. 
Later, Intel developed a similar board for the 8-bit processor. 

One day, talking over the phone with Shima in Japan, 
Faggin discovered that Busicom was having financial prob- 
lems. To be more competitive in the marketplace, the firm 
needed lower prices for the chip set. Faggin and Hoff then 
pleaded with Noyce and marketing that Intel give a price 
concession to Busicom in exchange for nonexclusivity. By 
May 1971, Intel had obtained the right to sell the calculator 
chips to others, except for desktop calculator applications. 

A brief new product announcement in Datamation magazine
mentioned the chip series. However, even with the
limited rights to sell the chips to other companies, Intel
management was reluctant to announce the microprocessors
officially. Marketing had deep concerns about the field sales
staff's ability to properly support such complex products.
Intel was developing a good reputation based on its mem- 
ory products and its support of them, and marketing did not 
want to risk that reputation. 

Marketing the early microprocessors

Hank Smith 



The first public announcement of a microprocessor by
Intel in November 1971 was really a turning point in the
history of the electronics industry and has continued to
profoundly affect all our daily lives. The history of the first
microprocessor design and development team, consisting
of Federico Faggin, Ted Hoff, Stan Mazor, and Masatoshi
Shima, is now well documented. But there was another
integral part of the team whose contribution to the early
success of the microprocessor has generally been ignored,
and that was the marketing group. So, when Faggin asked
me to describe the early days of marketing the
microprocessor, I was delighted because so little has been
written about this important part of microprocessor history.

Very early on, we realized that the microprocessor was
very different from any other product Intel had introduced,
and that we would have to market it very differently. First,
most engineers were unfamiliar with programming and
debugging software (particularly at the machine and
assembly level), which was going to be necessary in
designing systems with this product. Second, we felt that
we could build a group of loyal customers because, once
they designed applications using the microprocessor, their
significant software investment would keep them from
changing products or suppliers. Our primary objective was
to get companies to design Intel's microprocessor into as
many applications as possible, and as quickly as possible.
To do this, we had to make it as easy as possible to use.
Thus, we needed flexible, inexpensive design tools and
development systems, and a group dedicated to the
marketing and sales of the microprocessor. This, then, became
the mission of the first Microcomputer Systems Group,
formed in April of 1972.

We were pioneers. No one had ever marketed a product
like this before, and we had nothing to guide us.
Everything we did was a first, and we influenced the
directions of the industry for many years with the design tools
we introduced.
We were first with the following:

• Microcomputer Systems Group. For the first time a
semiconductor company combined hardware engineering,
software development, manufacturing, and
marketing in a single marketing group.

• Comprehensive documentation and manuals. We
marketed each microprocessor chip as part of a series
(the MCS-4 and MCS-8 chip sets) and provided user's
manuals and comprehensive documentation for using
the products.

• Simulation (development) boards. The Sim4-01 and
Sim8-01 were general-purpose microcomputer modules
that customers could use for development,
preproduction, and small production runs of
microprocessor-based products.

• PL/M high-level language. PL/M was originally
developed for the 8008 set by Gary Kildall. This language
made it possible to write a program once and, by
compiling it, have it run on all different kinds of 8008 and
8080 products and systems. We provided a compiler,
cross assembler, and simulator, all written in Fortran
IV, that could run on a general-purpose computer or
one of our development systems.

• Intellec development system. The Intellec-4 and
Intellec-8 development systems were self-contained,
expandable systems complete with CPU, memory,
I/O, clock, TTY interface, power supply, control and
display modules, and standard software. These
systems, which could be programmed in PL/M, were
really the forerunners of the more sophisticated MDS
development systems, and the personal computer.

No one could have forecast the microprocessor's
unbelievable success, and I feel very fortunate to have been an
important part of the original team responsible for the
launch and success of this revolutionary product.

After leaving Intel, Hank Smith spent 15 years in the
venture capital industry as a general partner of Venrock
Associates. He currently lives in Woodstock, Vermont, where
he is an independent investor. He owns an antique
car-restoration business and a horse farm, and is a principal
owner of the Norwich Navigators, a AA minor league
baseball team affiliated with the New York Yankees.



Another concern, one shared by Hoff and Mazor, was that
customers accustomed to the power of minicomputers would 
be unable to adapt to the microprocessors' poorer perfor- 
mance. However, both felt that proper presentation would 
prepare customers for the limitations, and that micro- 
processors would still find many uses. 

In the summer of 1971, major changes in Intel's market- 
ing department brought in a new vice president of market- 
ing — Ed Gelbach, formerly with Texas Instruments. Ed was 
much braver than his predecessors, and he arranged the for- 
mal announcement of the 4004 in November of 1971. Hank 
Smith, working for Gelbach, became the first microcomput- 
er marketing manager. 



Market reactions

Intel had changed the chip set's name from 4000 to MCS-4,
for Micro Computer System 4-bit; the response to the
announcement of this first microprocessor was very
encouraging. Marketing worked with Hoff, Faggin, Mazor, and Hal
Feeney to provide support. The support items included data
sheets with application information, user manuals, and
printed circuit boards. Marketing released literature that also
revealed the coming 8-bit processor, which Intel officially
announced in April 1972 under the name 8008 (twice the
4004!) as the core of the MCS-8 series.

The 8008 could actually have been the world's first
microprocessor. A few weeks before Intel hired Faggin to design
the 4000 set, Hal Feeney had joined Intel to work on the
8-bit microprocessor for CTC. Feeney worked with Mazor and
CTC to complete the specifications for the chip — internally 
called the 1201 — and to modify the CTC architecture as nec- 
essary for silicon implementation. Financial problems at CTC, 
however, soon reduced the 1201's priority, so Feeney was 
diverted to other projects. Other potential customers kept 
the project alive, but the design did not proceed much past 
the first few months of work. 

The CTC project remained dormant until January 1971, when 
Intel reassigned Feeney, now working under Faggin's super- 
vision, to the project. The designers' recent experience with 
the 4004 provided a proven design methodology that paved the 
way for the 8008. Feeney did the detailed design of the 8008, 
and by March 1972, Intel was producing working chips. 

Ironically, during 1970, CTC had also contracted with 
Texas Instruments to design the same processor using TI's 
MOS aluminum-gate process. TI's chip, heralded in the tech- 
nical press in June 1971 as the first CPU on a chip, was more 
than twice the size of the 8008. CTC reported that it had 
never fully worked. 

Intel promoted both the 4004 and the 8008, and in May of 
1972, Hoff and Mazor presented several seminars around the 
country. The microprocessors generated much interest, and 
many of Intel's customers began to design products based on 
them. Of the two, the 4004 offered lower cost and a higher 
degree of integration for the resulting system, because the series 
offered RAM and ROM chips with I/O capability on the same
chip. The 8008 could address a larger memory space (up to 16
Kbytes) and could use any mix of RAM or ROM for its memo- 
ry. However, the 8008 required some 20 standard TTL inte- 
grated circuits to provide the interface between the processor, 
memory, and I/O. While the 8008 instruction cycle was actu- 
ally somewhat slower than the 4004, most customers perceived 
it as the preferred processor for more complex applications. 

The 4004 and the 8008 became archetypes for today's two 
primary markets for microprocessors: embedded applications 
and user-programmable computers. Most microprocessors 
used in embedded applications are now integrated with the 
memory and the I/O functions, making them true single-chip computers.
Thus, a low-cost, single chip can typically do all the work 
required in many simple control applications. Such devices 
are called microcontrollers. Simple 4-bit and 8-bit micro- 
controllers control microwave ovens and computer key- 
boards, for example, while sophisticated microcontrollers 
drive cellular phones and laser printers. 

Currently, the semiconductor industry manufactures a few 
billion microcontrollers worldwide per year. More than 50% 
of all microcontroller units manufactured in 1995 were still 
4-bit devices with capabilities equivalent to those of the MCS- 
4 set. Nonetheless, the more expensive 8-bit microcontrollers 
have the majority of the market dollar volume. 

The 8008's first application was a Seiko user-programmable
scientific calculator, and soon the 8008 led to the
personal computer, the quintessential microprocessor
application. In fact, many consider the personal computer's 
archetype to be the Micral, a French desktop computer using 
the 8008 CPU, sold in 1973. The 8008 evolved into Intel's 
8080, the first high-performance microprocessor, conceived 
and designed by Faggin and Shima, with architectural
contributions by Mazor and Hoff. This evolution has continued
on to the present Pentium Pro, with a new generation, on 
average, every three years. 



The personal computer has become an enormous 
market for microprocessors, and is considered by the pop- 
ular media the microprocessor's primary use. When Intel 
originally announced the microprocessor, however, Faggin, 
Hoff, and Mazor considered its primary market to be con- 
trol devices — applications now described as embedded con- 
trol. While main microprocessors for personal computers do 
indeed represent a large market, with tens of millions of units 
sold each year, many more microprocessors and
microcontrollers go into embedded control applications, with a
typical microcontroller costing between 30¢ and $10.

From its modest beginning 25 years ago, the microproces- 
sor industry has grown to such an extent that nearly 70% of 
all semiconductors sold worldwide are either microprocessors,
microcontrollers, or other components used in conjunction
with them, such as memory and I/O devices. Since the
worldwide sales of semiconductor components in 1995 were
approximately $150 billion, this means that the market
directly related to the microprocessor is over $100 billion at OEM
component prices. The market value of all the products
incorporating microprocessors is many times that figure, of
course: a truly staggering amount.
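
The market estimate above follows from a single multiplication. A minimal sketch, using only the two figures quoted in the text (the function and constant names are hypothetical, and "nearly 70%" is rounded to 70 for illustration):

```python
# Figures quoted in the text (1995, OEM component prices).
TOTAL_SEMICONDUCTOR_SALES_BUSD = 150   # approx. worldwide sales, $ billion
MICRO_RELATED_SHARE_PERCENT = 70       # "nearly 70%", rounded for illustration


def micro_related_market_busd(total_busd, share_percent):
    """Integer estimate, in $ billion, of the microprocessor-related market."""
    return total_busd * share_percent // 100


estimate = micro_related_market_busd(TOTAL_SEMICONDUCTOR_SALES_BUSD,
                                     MICRO_RELATED_SHARE_PERCENT)
print(f"Microprocessor-related market: about ${estimate} billion")
```

With these inputs the estimate is about $105 billion, in line with the "over $100 billion" stated in the text.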

Over the last 25 years, there has been an explosion of appli- 
cations. People carry microprocessors with them inside their 
watches, pocket calculators, organizers, and cellular phones; 
and microprocessors are all around them, in their homes, 
cars, offices, and laboratories. The microprocessor has 
improved the quality, cost, and functionality of traditional 
electronic equipment. But, most importantly, it has enabled 
literally thousands of new applications impossible before its
advent. Amazingly, the pace of deployment of microproces- 
sors and microcontrollers in new applications is still going 
strong, and we expect it to continue for the foreseeable future. 
Without question, the microprocessor reality has far
exceeded even the most bullish expectations of its creators.